The SuperSID project: exploiting high-level information for high-accuracy speaker recognition
نویسندگان
چکیده
• This work is sponsored by the Department of Defense under Air Force Contract F19628-00-C-0002.and the CLSP/JHU workshop was supported by NSF and DoD fudning. Opinions, interpretations, conclusions and recommendations are those of the authors and are not necessarily endorsed by the United States Government + The authors gratefully acknowledge the CLSP group at JHU for organizing and hosting WS2002. ABSTRACT
منابع مشابه
Fusing high- and low-level features for speaker recognition
The area of automatic speaker recognition has been dominated by systems using only short-term, low-level acoustic information, such as cepstral features. While these systems have produced low error rates, they ignore higher levels of information beyond low-level acoustics that convey speaker information. Recently published works have demonstrated that such high-level information can be used suc...
متن کاملExploiting High-Level Information Provided by ALISP in Speaker Recognition
The best performing systems in the area of automatic speaker recognition have focused on using short-term, low-level acoustic information, such as sepstral features. Recently, various works have demonstrated that high-level features convey more speaker information and can be added to the low-level features in order to increase the robustness of the system. This paper describes a text-independen...
متن کاملشبکه عصبی پیچشی با پنجرههای قابل تطبیق برای بازشناسی گفتار
Although, speech recognition systems are widely used and their accuracies are continuously increased, there is a considerable performance gap between their accuracies and human recognition ability. This is partially due to high speaker variations in speech signal. Deep neural networks are among the best tools for acoustic modeling. Recently, using hybrid deep neural network and hidden Markov mo...
متن کاملDuration and pronunciation conditioned lexical modeling for speaker verification
We propose a method to improve speaker recognition lexical model performance using acoustic-prosodic information. More specifically, the lexical model is trained using durationand pronunciation-conditioned word N-grams, simultaneously modeling lexical information along with their acoustic and prosodic characteristics. Support vector machines are used for modeling and scoring, with N-gram freque...
متن کاملRobust Speaker Recognition
The automatic speaker recognition technologies have developed into more and more important modern technologies required by many speech-aided applications. The main challenge for automatic speaker recognition is to deal with the variability of the environments and channels from where the speech was obtained. In previous work, good results have been achieved for clean high-quality speech with mat...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2003